■Bio-mathematics, Statistics and Nano-Technologies: Mosquito Control Strategies
Figure 9.5: The clustering of natural and synthesized compounds (Syn1 – Syn13) with
repellent activity toward A. gambiae females (Thireou et al. 2018) based on calculated
physicochemical descriptors.
9.3.5.2 Principal component analysis
Along with cluster analysis, principal component analysis (PCA) is one of the most
frequently used pattern recognition methods. When the studied variables are
multicollinear, it is meaningful to apply PCA in order to reduce the dimensionality of the
data set and to define new principal variables (principal components). The results of the
analysis are presented as a scores plot and a loadings plot, which reveal similarities
among the compounds and among the variables, respectively. Figure 9.7 presents the PCA
score plot for the same set of compounds that was analyzed by HCA, on the basis of the
same set of molecular descriptors. The score plot
indicates a clear separation of the majority of the synthesized compounds along the PC1
axis, which accounts for 47.34% of the total variance. Most of the natural repellent
compounds are placed at the positive end of the PC1 axis. The distribution of the
compounds along the PC2 axis, which accounts for 34.46% of the total variance, is based
mostly on their lipophilicity, the descriptor with the strongest influence on PC2. The
influence of the descriptors on the distribution of the compounds is determined on the
basis of the loadings plot (not shown).
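
The quantities discussed above (scores, loadings and the share of variance carried by each
component) can be illustrated with a minimal sketch. It is not the procedure used for
Figure 9.7: the data matrix is randomly generated, the function name pca_autoscaled is
hypothetical, and autoscaling of the descriptors before the decomposition is an assumption.
With the values reported above, PC1 and PC2 together would account for
47.34% + 34.46% = 81.80% of the total variance.

```python
import numpy as np


def pca_autoscaled(X, n_components=2):
    """PCA of a (compounds x descriptors) matrix via SVD of autoscaled data.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # Autoscale each descriptor to zero mean and unit variance (an assumption;
    # the preprocessing used in the chapter is not stated in this section).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Singular value decomposition: Z = U * diag(s) * Vt
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Fraction of the total variance carried by each principal component
    explained = s**2 / np.sum(s**2)
    # Scores: coordinates of the compounds in the space of the components
    scores = U[:, :n_components] * s[:n_components]
    # Loadings: contribution of each descriptor to each component
    loadings = Vt[:n_components].T
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for a descriptor table: 30 hypothetical compounds,
# 6 correlated descriptors generated from 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.1 * rng.normal(size=(30, 6))

scores, loadings, explained = pca_autoscaled(X)
print("Variance explained by PC1 and PC2 (%):", np.round(100 * explained, 2))
print("Descriptor with the strongest influence on PC2 (column index):",
      int(np.argmax(np.abs(loadings[:, 1]))))
```

Plotting the two columns of scores against each other gives the score plot, while the rows
of loadings indicate which descriptors drive the position of the compounds along each axis.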